The Stan Math Library: Reverse-Mode Automatic Differentiation in C++
Abstract
As computational challenges in optimization and statistical inference grow ever harder, algorithms that use derivatives are becoming increasingly important. Implementing the derivatives that make these algorithms so powerful, however, is a substantial burden on users, and the practicality of these algorithms depends critically on tools such as automatic differentiation that remove the implementation burden entirely. The Stan Math Library is a C++ reverse-mode automatic differentiation library designed to be usable, extensive and extensible, efficient, scalable, stable, portable, and redistributable, in order to facilitate the construction and use of such algorithms. Usability is achieved through a simple direct interface and a cleanly abstracted functional interface. The extensive built-in library includes functions for matrix operations, linear algebra, differential equation solving, and most common probability functions. Extensibility derives from a straightforward object-oriented framework for expressions, allowing users to easily create custom functions. Efficiency is achieved through a combination of custom memory management, subexpression caching, traits-based metaprogramming, and expression templates. Partial derivatives for compound functions are evaluated lazily for improved scalability. Stability is achieved by taking care with arithmetic precision in algebraic expressions and by providing stable compound functions where possible. For portability, the library is written in standards-compliant C++ (C++03) and has been tested with all major compilers for Windows, Mac OS X, and Linux. It is distributed under the new BSD license. This paper provides an overview of the Stan Math Library's application programming interface (API), examples of its use, and a thorough explanation of how it is implemented. It also demonstrates the library's efficiency and scalability by comparing the speed and memory usage of its gradient calculations with those of several popular open-source C++ automatic differentiation systems (Adept, ADOL-C, CppAD, and Sacado), with results varying dramatically according to the type of expression being differentiated.

1. Reverse-Mode Automatic Differentiation

Many contemporary algorithms require the evaluation of a derivative of a given differentiable function $f$ at a given input value $(x_1, \ldots, x_N)$: for example, a gradient,

$$\left( \frac{\partial f}{\partial x_1}(x_1, \ldots, x_N), \; \cdots, \; \frac{\partial f}{\partial x_N}(x_1, \ldots, x_N) \right),$$

or a directional derivative,

$$\vec{v}(f)(x_1, \ldots, x_N) = \sum_{n=1}^{N} v_n \, \frac{\partial f}{\partial x_n}(x_1, \ldots, x_N).$$

Automatic differentiation computes these values automatically, using only a representation of $f$ as a computer program. For example, automatic differentiation can take a simple C++ expression such as x * y / 2 with inputs x = 6 and y = 4 and produce both the output value, 12, and the gradient, (2, 3).
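For instance, here is a minimal sketch of this example in terms of the library's reverse-mode scalar type, stan::math::var; the no-argument grad() call and the val() and adj() accessors used here follow the API described later in the paper:

    #include <stan/math.hpp>
    #include <iostream>

    int main() {
      using stan::math::var;

      var x = 6;          // independent variables are recorded as nodes
      var y = 4;          // of the expression graph as they are created
      var f = x * y / 2;  // extends the graph with nodes for * and /

      f.grad();           // reverse pass: propagate adjoints from f

      std::cout << "f     = " << f.val() << "\n";  // 12
      std::cout << "df/dx = " << x.adj() << "\n";  // y / 2 = 2
      std::cout << "df/dy = " << y.adj() << "\n";  // x / 2 = 3
      return 0;
    }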
Automatic differentiation is implemented in practice by transforming the subexpressions of the given computer program into nodes of an expression graph (see Figure 1, below, for an example) and then propagating chain-rule evaluations along those nodes (Griewank and Walther, 2008; Giles, 2008). In forward-mode automatic differentiation, each node $k$ of the graph carries both a value, $x_k$, and a tangent, $t_k$, representing the directional derivative of $x_k$ with respect to the input variables. The tangents of the input nodes are initialized to the components of $\vec{v}$, since those are exactly the directional derivatives of the inputs themselves. The complete set of tangents is then computed by propagating forward from the inputs to the outputs with the rule

$$t_i = \sum_{j \in \mathrm{children}[i]} \frac{\partial x_i}{\partial x_j} \, t_j.$$

As a special case, the derivative with respect to a single input variable is obtained by setting $\vec{v}$ to 1 for that distinguished variable and 0 for all others.
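To make the propagation rule concrete, here is a toy forward-mode sketch (not part of the Stan Math Library; the Dual type and its operators are hypothetical illustrations) that pushes the earlier x * y / 2 example through the rule with $\vec{v} = (1, 0)$, so the tangent of the result is $\partial f / \partial x$:

    #include <iostream>

    // A toy forward-mode node: each quantity carries its value x_k
    // and its tangent t_k, the directional derivative of x_k along v.
    struct Dual {
      double val;  // x_k
      double tan;  // t_k
    };

    // Each operation applies the propagation rule
    //   t_i = sum over operands j of (dx_i / dx_j) * t_j.
    Dual operator*(Dual a, Dual b) {
      return {a.val * b.val, b.val * a.tan + a.val * b.tan};
    }

    Dual operator/(Dual a, double c) {
      return {a.val / c, a.tan / c};
    }

    int main() {
      Dual x = {6, 1};  // v = (1, 0): tangent 1 for x and
      Dual y = {4, 0};  // 0 for y, so we recover df/dx
      Dual f = x * y / 2;
      std::cout << f.val << " " << f.tan << "\n";  // 12 2
      return 0;
    }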
Journal: CoRR
Volume: abs/1509.07164
Year: 2015